Detecting Outliers under Interval Uncertainty: A New Algorithm Based on Constraint Satisfaction

Authors

  • Evgeny Dantsin
  • Alexander Wolpert
  • Martine Ceberio
  • Gang Xiang
  • Vladik Kreinovich
Abstract

In many application areas, it is important to detect outliers. In the traditional engineering approach to outlier detection, we start with some “normal” values $x_1, \ldots, x_n$, compute the sample average $E$ and the sample standard deviation $\sigma$, and then mark a value $x$ as an outlier if $x$ lies outside the $k_0$-sigma interval $[E - k_0 \cdot \sigma, E + k_0 \cdot \sigma]$ (for some pre-selected parameter $k_0$). In real life, we often have only interval ranges $[\underline{x}_i, \overline{x}_i]$ for the normal values $x_1, \ldots, x_n$. In this case, we only have intervals of possible values for the bounds $L \stackrel{\text{def}}{=} E - k_0 \cdot \sigma$ and $U \stackrel{\text{def}}{=} E + k_0 \cdot \sigma$. We can therefore identify outliers as values that are outside all $k_0$-sigma intervals, i.e., values that are outside the interval $[\underline{L}, \overline{U}]$. In general, the problem of computing $\underline{L}$ and $\overline{U}$ is NP-hard; a polynomial-time algorithm is known for the case when the measurements are sufficiently accurate, i.e., when the “narrowed” intervals $\left[\tilde{x}_i - \frac{1 + \alpha^2}{n} \cdot \Delta_i,\ \tilde{x}_i + \frac{1 + \alpha^2}{n} \cdot \Delta_i\right]$, where $\alpha = 1/k_0$ and $\Delta_i \stackrel{\text{def}}{=} (\overline{x}_i - \underline{x}_i)/2$ is the interval’s half-width, do not intersect with each other. In this paper, we use constraint satisfaction to show that we can efficiently compute $\underline{L}$ and $\overline{U}$ under a weaker (and more general) condition: none of the narrowed intervals is a proper subinterval of another narrowed interval.
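The k0-sigma test described in the abstract can be illustrated with a minimal Python sketch. This is not the constraint-satisfaction algorithm of the paper: for interval data it simply enumerates all 2^n combinations of interval endpoints, which is exponential in n and only practical for tiny inputs. The enumeration is exact because E + k0 · σ is a convex function of the values and E − k0 · σ is a concave one, so their extreme values over a box are attained at its corners. The function names are hypothetical, and the use of the divide-by-n standard deviation (`statistics.pstdev`) is an assumption; swap in `statistics.stdev` if the divide-by-(n − 1) convention is intended.

```python
from itertools import product
from statistics import fmean, pstdev


def k_sigma_interval(values, k0):
    """Return (E - k0*sigma, E + k0*sigma) for point-valued data."""
    e = fmean(values)
    sigma = pstdev(values)  # assumption: divide-by-n standard deviation
    return e - k0 * sigma, e + k0 * sigma


def guaranteed_outlier_bounds(intervals, k0):
    """Brute-force the smallest possible L and the largest possible U.

    `intervals` is a list of (low, high) pairs. Every combination of
    endpoints is tried, so the running time grows as 2**n.
    """
    lowest_l = float("inf")
    highest_u = float("-inf")
    for corner in product(*intervals):
        l, u = k_sigma_interval(corner, k0)
        lowest_l = min(lowest_l, l)
        highest_u = max(highest_u, u)
    return lowest_l, highest_u


def is_guaranteed_outlier(x, intervals, k0):
    """True if x lies outside every possible k0-sigma interval."""
    lowest_l, highest_u = guaranteed_outlier_bounds(intervals, k0)
    return x < lowest_l or x > highest_u


if __name__ == "__main__":
    data = [(0.0, 1.0), (2.0, 3.0)]  # illustrative interval measurements
    print(guaranteed_outlier_bounds(data, k0=1.0))
    print(is_guaranteed_outlier(3.5, data, k0=1.0))
```

For realistic n, the brute-force loop would have to be replaced by a polynomial-time method such as the one the paper proposes; the sketch only makes the definitions concrete.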

Similar resources

Evaluating the Effectiveness of Integrated Benders Decomposition Algorithm and Epsilon Constraint Method for Multi-Objective Facility Location Problem under Demand Uncertainty

One of the most challenging issues in multi-objective problems is finding Pareto optimal points. This paper describes an algorithm based on the Benders Decomposition Algorithm (BDA) that tries to find Pareto solutions. To this end, a multi-objective facility location allocation model is proposed. An integrated BDA and epsilon constraint method is then proposed, and it is shown how P...

Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased parameter estimates, and inaccurate predictions. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...

Three Hybrid Metaheuristic Algorithms for Stochastic Flexible Flow Shop Scheduling Problem with Preventive Maintenance and Budget Constraint

The stochastic flexible flow shop scheduling problem (SFFSSP) is one of the main focuses of researchers because of the complexity arising from inherent uncertainties and the difficulty of solving such NP-hard problems. Conventionally, in such problems each machine’s job processing time may be uncertain due to its random behaviour. In order to examine such problems more realistically, fi...

A GMDH neural network-based approach to passive robust fault detection using a constraint satisfaction backward test

This paper proposes a new passive robust fault detection scheme using non-linear models that include parameter uncertainty. The nonlinear model considered here is described by a group method of data handling (GMDH) neural network. The problem of passive robust fault detection using models including parameter uncertainty has been mainly addressed by checking if the measured behaviour is inside t...

Computing the Uncertainty of the 8 point Algorithm for Fundamental Matrix Estimation

Fundamental matrix estimation is difficult since it is often based on correspondences that are spoilt by noise and outliers. Outliers must be thrown out via robust statistics, and noise gives uncertainty. In this article we provide a closed-form formula for the uncertainty of the so-called 8 point algorithm, which is a basic tool for fundamental matrix estimation via robust methods. As an appli...

Publication date: 2005